SwePub
Tyck till om SwePub Sök här!
Sök i SwePub databas

  Extended search

Träfflista för sökning "db:Swepub ;pers:(Jantsch Axel);mspu:(doctoralthesis)"

Search: db:Swepub > Jantsch Axel > Doctoral thesis

  • Result 1-10 of 10
Sort/group result
   
EnumerationReferenceCoverFind
1.
  • Al Khatib, Iyad, 1975- (author)
  • Performance Analysis of Application-Specific Multicore Systems on Chip
  • 2008
  • Doctoral thesis (other academic/artistic)abstract
    • The last two decades have witnessed the birth of revolutionary technologies in data communications including wireless technologies, System on Chip (SoC), Multi Processor SoC (MPSoC), Network on Chip (NoC), and more. At the same time we have witnessed that performance does not always keep pace with expectations in many services like multimediaservices and biomedical applications. Moreover, the IT market has suffered from some crashes. Hence, this triggered us to think of making use of available technologies and developing new ones so that the performance level is suitable for given applications and services. In the medical field, from a statistical viewpoint, the biggest diseases in number of deaths are heart diseases, namely Cardiovascular Disease (CVD) and Stroke. The application with the largest market for CVD is the electrocardiogram (ECG/EKG) analysis. According to the World Health Organization (WHO) report in 2003, 29.2% of global deaths are due to CVD and Stroke, half of which could be prevented if there was proper monitoring. We found in the new advance in microelectronics, NoC, SoC, and MPSoC, a chance of a solution for such a big problem. We look at the communication technologies, wireless networks, and MPSoC and realize that many projects can be founded, and they may affect people's lives positively, as for example, curing people more rapidly, as well as homecare of such large scale diseases. These projects have a medical impact as well as economic and social impacts. The intention is to use performance analysis of interconnected microelectronic systems and combine it with MPSoC and NoC technologies in order to evolve to new systems on chip that may make a difference. Technically, we aim at rendering more computations in less time, on a chip with smaller volume, and with less expense. The performance demand and the vision of having a market success, i.e. contributing to lower healthcare costs, pose many challenges on the hardware/software co-design to meet these goals. This calls upon the development of new integrated circuits featuring increased energy efficiency while providing higher computation capabilities, i.e. better performance. The biomedical application of ECG analysis is an ideal target for an application-specific SoC implementation. However, new 12-lead ECG analyses algorithms are needed to meet the aforementioned goals. In this thesis, we present two novel algorithms for ECG analysis, namely the Autocorrelation-Function (ACF) based algorithm and the Fast Fourier Transform (FFT) based algorithm. In this respect, we explore the design space by analyzing different hardware and software architectures. As a result, we realize a design with twelve processors that can compute 3.5 million arithmetic computations and respect the real time hard deadline for our biomedical application (3.5-4seconds), and that can deploy the ACF-based and FFT-based algorithms. Then, we investigate the configuration space looking for the most effective solution, performance and energy-wise. Consequently, we present three interconnect architectures (Single Bus, Full Crossbar, and Partial Crossbar) and compare them with existing solutions. The sampling frequencies of 2.2 KHz and 4 KHz, with 12 DSPs, are found to be the critical points for our Shared-Bus design and Crossbar architecture, respectively. We also show how our performance analysis methods can be applied to such a field of SoC design and with a specific purpose application in order to converge to a solution that is acceptable from a performance viewpoint, meets the real-time demands, and can be implemented with the present technologies while at the same time paving the way for easier and faster development. In order to connect our MPSoC solution to communication networks to transmit the medical results to a healthcare center, we come up with new protocols that will allow the integration of multiple networks on chips in a communication network. Finally, we present a methodology for HW/SW Codesign for application-specific systems (with focus on biomedical applications) that require a large number of computations since this will foster the convergence to solutions that are acceptable from a performance point of view.
  •  
2.
  • Eslami Kiasari, Abbas (author)
  • Performance Analysis and Design Space Exploration of On-Chip Interconnection Networks
  • 2013
  • Doctoral thesis (other academic/artistic)abstract
    • The advance of semiconductor technology, which has led to more than one billion transistors on a single chip, has enabled designers to integrate dozens of IP (intellectual property) blocks together with large amounts of embedded memory. These advances, along with the fact that traditional communication architectures do not scale well have led to significant changes in the architecture and design of integrated circuits. One solution to these problems is to implement such a complex system using an on-chip interconnection network or network-on-chip (NoC). The multiple concurrent connections of such networks mean that they have extremely high bandwidth. Regularity can lead to design modularity providing a standard interface for easier component reuse and improved interoperability.The present thesis addresses the performance analysis and design space exploration of NoCs using analytical and simulation-based performance analysis approaches. At first, we developed a simulator aimed to performance analysis of interconnection networks. The simulator is then used to evaluate the performance of networks topologies and routing algorithms since their choice heavily affect the performance of NoCs. Then, we surveyed popular mathematical formalisms – queueing theory, network calculus, schedulability analysis, and dataflow analysis – and how they have been applied to the analysis of on-chip communication performance in NoCs. We also addressed research problems related to modelling and design space exploration of NoCs.In the next step, analytical router models were developed that analyse NoC performance. In addition to providing aggregate performance metrics such as latency and throughput, our approach also provides feedback about the network characteristics at a fine-level of granularity. Our approach explicates the impact that various design parameters have on the performance, thereby providing invaluable insight into NoC design. This makes it possible to use the proposed models as a powerful design and optimisation tool.We then used the proposed analytical models to address the design space exploration and optimisation problem. System-level frameworks to address the application mapping and to design routing algorithms for NoCs were presented. We first formulated an optimisation problem of minimizing average packet latency in the network, and then solved this problem using the simulated annealing heuristic. The proposed framework can also address other design space exploration problems such as topology selection and buffer dimensioning.
  •  
3.
  • Jiang, Ke (author)
  • Security-Driven Design of Real-Time Embedded Systems
  • 2015
  • Doctoral thesis (other academic/artistic)abstract
    • Real-time embedded systems (RTESs) have been widely used in modern society. And it is also very common to find them in safety and security critical applications, such as transportation and medical equipment. There are, usually, several constraints imposed on a RTES, for example, timing, resource, energy, and performance, which must be satisfied simultaneously. This makes the design of such systems a difficult problem.More recently, the security of RTESs emerges as a major design concern, as more and more attacks have been reported. However, RTES security, as a parameter to be considered during the design process, has been overlooked in the past. This thesis approaches the design of secure RTESs focusing on aspects that are particularly important in the context of RTES, such as communication confidentiality and side-channel attack resistance.Several techniques are presented in this thesis for designing secure RTESs, including hardware/software co-design techniques for communication confidentiality on distributed platforms, a global framework for secure multi-mode real-time systems, and a scheduling policy for thwarting differential power analysis attacks. All the proposed solutions have been extensively evaluated in a large amount of experiments, including two real-life case studies, which demonstrate the efficiency of the presented techniques.
  •  
4.
  • Liu, Ming, 1982- (author)
  • Adaptive Computing based on FPGA Run-time Reconfigurability
  • 2011
  • Doctoral thesis (other academic/artistic)abstract
    • In the past two decades, FPGA has been witnessed from its restricted use as glue logic towards real System-on-Chip (SoC) platforms. Profiting from the great development on semiconductor and IC technologies, the programmability of FPGAs enables themselves wide adoption in all kinds of aspects of embedded designs. Modern FPGAs provide the additional capability of being dynamically and partially reconfigured during the system run-time. The run-time reconfigurability enhances FPGA designs from the sole spatial to both spatial and temporal parallelism, providing more design flexibility for advanced system features. Adaptive computing delegates an advanced computing paradigm in which computation tasks and resources are intelligently managed in correspondence with conditional requirements. In this thesis, we investigate adaptive designs on FPGA platforms: We present a comprehensive and practical design framework for adaptive computing based on the FPGA run-time reconfigurability. It concerns several design key issues in different hardware/software layers, specifically hardware architecture, run-time reconfiguration technical support, OS and device drivers, hardware process scheduler, context switching as well as Inter-Process Communications (IPC). Targeting a special application of data acquisition (DAQ) and trigger systems in nuclear and particle physics experiments, we set up the data streaming model and conduct theoretical analysis on the adaptive system. Three application studies are employed to verify the proposed adaptive design framework: The first application demonstrates a peripheral controller adaptable system aiming at general embedded designs. Through dynamically loading/unloading a NOR flash memory controller and an SRAM controller, both flash memory and SRAM accesses may be accomplished with less resource consumption than in traditional static designs. In the second case, two real algorithm processing engines are adaptively time-multiplexed in the same reconfigurable slot for particle recognition computation. Experimental results reveal the reduced on-chip resource requirements, as well as an approximate processing capability of the peer static design. Taking advantage of the FPGA dynamic reconfigurability, we present in the third application a novel on-FPGA interconnection microarchitecture named RouterLess NoC (RL-NoC). RL-NoC employs the novel design concept of Move Logic Not Data (MLND), and significantly distinguishes itself from the existing interconnection architectures such as buses, crossbars or NoCs. It does not rely on routers to deliver packets hop by hop as canonical NoCs do, but buffers data packets in virtual channels and brings various nodes using run-time reconfiguration to produce or consume them. In comparison with canonical packet-switching NoCs, the routerless architecture features lower design complexity, less resource consumption, higher work frequency, more efficient power dissipation as well as comparable or even higher packet delivery efficiency. It is regarded as a promising interconnection approach in some design scenarios on FPGAs, especially for light-weight applications.
  •  
5.
  • Lu, Zhonghai (author)
  • Design and Analysis of On-Chip Communication for Network-on-Chip Platforms
  • 2007
  • Doctoral thesis (other academic/artistic)abstract
    • Due to the interplay between increasing chip capacity and complex applications, System-on-Chip (SoC) development is confronted by severe challenges, such as managing deep submicron effects, scaling communication architectures and bridging the productivity gap. Network-on-Chip (NoC) has been a rapidly developed concept in recent years to tackle the crisis with focus on network-based communication. NoC problems spread in the whole SoC spectrum ranging from specification, design, implementation to validation, from design methodology to tool support. In the thesis, we formulate and address problems in three key NoC areas, namely, on-chip network architectures, NoC network performance analysis, and NoC communication refinement. Quality and cost are major constraints for micro-electronic products, particularly, in high-volume application domains. We have developed a number of techniques to facilitate the design of systems with low area, high and predictable performance. From flit admission and ejection perspective, we investigate the area optimization for a classical wormhole architecture. The proposals are simple but effective. Not only offering unicast services, on-chip networks should also provide effective support for multicast. We suggest a connection-oriented multicasting protocol which can dynamically establish multicast groups with quality-of-service awareness. Based on the concept of a logical network, we develop theorems to guide the construction of contention-free virtual circuits, and employ a back-tracking algorithm to systematically search for feasible solutions. Network performance analysis plays a central role in the design of NoC communication architectures. Within a layered NoC simulation framework, we develop and integrate traffic generation methods in order to simulate network performance and evaluate network architectures. Using these methods, traffic patterns may be adjusted with locality parameters and be configured per pair of tasks. We propose also an algorithm-based analysis method to estimate whether a wormhole-switched network can satisfy the timing constraints of real-time messages. This method is built on traffic assumptions and based on a contention tree model that captures direct and indirect network contentions and concurrent link usage. In addition to NoC platform design, application design targeting such a platform is an open issue. Following the trends in SoC design, we use an abstract and formal specification as a starting point in our design flow. Based on the synchronous model of computation, we propose a top-down communication refinement approach. This approach decouples the tight global synchronization into process local synchronization, and utilizes synchronizers to achieve process synchronization consistency during refinement. Meanwhile, protocol refinement can be incorporated to satisfy design constraints such as reliability and throughput. The thesis summarizes the major research results on the three topics.
  •  
6.
  • Millberg, Mikael, 1973- (author)
  • Architectural Techniques for Improving Performance in Networks on Chip
  • 2011
  • Doctoral thesis (other academic/artistic)abstract
    • The main aim of this thesis is to propose enhancing techniques for the performance in Networks on Chips. In addition, a concrete proposal for a protocol stack within our NoC platform Nostrum is presented. Nostrum inherently supports both Best Effort as well as Guaranteed Throughput traffic delivery. It employs a deflective routing scheme for best effort traffic delivery that gives a small footprint of the switches in combination with robustness to disturbances in the network. For the traffic delivery with hard guarantees a TDMA based scheme is used. During the transmission process in a NoC several stages are involved. In the papers included, I propose a set of strategies to enhance the performance in several of these stages. The strategies are summarised as follows Temporally Disjoint Networks is that a physical network, potentially, can be seen to contain a set of separate networks that a packet can enter dependenton when it enters the physical network. This has the consequence that wecould have different traffic types in the different networks. Looped containers provide means to set up virtual circuits in networksusing deflective routing. High priority container packets are inserted intothe network to follow a predefined, closed, route between source and destination.At sender side the packets are loaded and sent to the destination where it is unloaded and sent back. Proximity Congestion Awareness reduces the load of the network by diverting packets away from congested areas. It can increase the maximum trafficload by a factor of 20. Dual Packet Exit increases the exit bandwidth of the network leading to a50 percent reduction in worst-case latency and a 30 percent reduction inaverage latency as well as a lowered buffer usage. Priority Based Forced Requeue prematurely lifts out low priority packetsfrom the network to be requeued. Packets that have not yet entered the network compete with packets inside the network which gives tighter boundson admission with a reduction of worst case latencies by 50 percent. Furthermore, Operational Efficiency is proposed as a measure to quantifyhow effective a network is and is defined as the throughput per buffers used in the system. An increase of the injection of packets into the network to increase the system throughput will have a cost associated to it and can be optimised to save energy.
  •  
7.
  • Naeem, Abdul, 1976- (author)
  • Architecture Support and Scalability Analysis of Memory Consistency Models in Network-on-Chip based Systems
  • 2013
  • Doctoral thesis (other academic/artistic)abstract
    • The shared memory systems should support parallelization at the computation (multi-core), communication (Network-on-Chip, NoC) and memory architecture levels to exploit the potential performance benefits. These parallel systems supporting shared memory abstraction both in the general purpose and application specific domains are confronting the critical issue of memory consistency. The memory consistency issue arises due to the unconstrained memory operations which leads to the unexpected behavior of shared memory systems. The memory consistency models enforce ordering constraints on the memory operations for the expected behavior of the shared memory systems. The intuitive Sequential Consistency (SC) model enforces strict ordering constraints on the memory operations and does not take advantage of the system optimizations both in the hardware and software. Alternatively, the relaxed memory consistency models relax the ordering constraints on the memory operations and exploit these optimizations to enhance the system performance at the reasonable cost. The purpose of this thesis is twofold. First, the novel architecture supports are provided for the different memory consistency models like: SC, Total Store Ordering (TSO), Partial Store Ordering (PSO), Weak Consistency (WC), Release Consistency (RC) and Protected Release Consistency (PRC) in the NoC-based multi-core (McNoC) systems. The PRC model is proposed as an extension of the RC model which provides additional reordering and relaxation in the memory operations. Second, the scalability analysis of these memory consistency models is performed in the McNoC systems.The architecture supports for these different memory consistency models are provided in the McNoC platforms. Each configurable McNoC platform uses a packet-switched 2-D mesh NoC with deflection routing policy, distributed shared memory (DSM), distributed locks and customized processor interface. The memory consistency models/protocols are implemented in the customized processor interfaces which are developed to integrate the processors with the rest of the system. The realization schemes for the memory consistency models are based on a transaction counter and an an an address ddress ddress ddress ddress ddress ddress stack tacktack-based based based based based based novel approaches.approaches.approaches.approaches. approaches.approaches.approaches.approaches.approaches.approaches. The transaction counter is used in each node of the network to keep track of the outstanding memory operations issued by a processor in the system. The address stack is used in each node of the network to keep track of the addresses of the outstanding memory operations issued by a processor in the system. These hardware structures are used in the processor interface to enforce the required global orders under these different memory consistency models. The realization scheme of the PRC model in addition also uses acquire counter for further classification of the data operations as unprotected and protected operations.The scalability analysis of these different memory consistency models is performed on the basis of different workloads which are developed and mapped on the various sized networks. The scalability study is conducted in the McNoC systems with 1 to 64-cores with various applications using different problem sizes and traffic patterns. The performance metrics like execution time, performance, speedup, overhead and efficiency are evaluated as a function of the network size. The experiments are conducted both with the synthetic and application workloads. The experimental results under different application workloads show that the average execution time under the relaxed memory consistency models decreases relative to the SC model. The specific numbers are highly sensitive to the application and depend on how well it matches to the architectures. This study shows the performance improvement under the relaxed memory consistency models over the SC model that is dependent on the computation-to-communication ratio, traffic patterns, data-to-synchronization ratio and the problem size. The performance improvement of the PRC and RC models over the SC model tends to be higher than 50% as observed in the experiments, when the system is further scaled up.
  •  
8.
  • Raudvere, Tarvo, 1976- (author)
  • System Level Techniques for Verification and Synchronization after Local Design Refinements
  • 2007
  • Doctoral thesis (other academic/artistic)abstract
    • Today's advanced digital devices are enormously complex and incorporate many functions. In order to capture the system functionality and to be able to analyze the needs for a final implementation more efficiently, the entry point of the system development process is pushed to a higher level of abstraction. System level design methodologies describe the initial system model without considering lower level implementation details and the objective of the design development process is to introduce lower level details through design refinement. In practice this kind of refinement process may entail non-semantic-preserving changes in the system description, and introduce new behaviors in the system functionality. In spite of new behaviors, a model formed by the refinement may still satisfy the design constraints and to realize the expected system. Due to the size of the involved models and the huge abstraction gap, the direct verification of a detailed implementation model against the abstract system model is quite impossible. However, the verification task can be considerably simplified, if each refinement step and its local implications are verified separately. One main idea of the Formal System Design (ForSyDe) methodology is to break the design process into smaller refinement steps that can be individually understood, analyzed and verified. The topic of this thesis is the verification of refinement steps in ForSyDe and similar methodologies. It proposes verification attributes attached to each non-semantic-preserving transformation. The attributes include critical properties that have to be preserved by transformations. Verification properties are defined as temporal logic expressions and the actual verification is done with the SMV model checker. The mapping rules of ForSyDe models to the SMV language are provided. In addition to properties, the verification attributes include abstraction techniques to reduce the size of the models and to make verification tractable. For computation refinements, the author defines the polynomial abstraction technique, that addresses verification of DSP applications at a high abstraction level. Due to the size of models, predefined properties target only the local correctness of refined design blocks and the global influence has to be examined separately. In order to compensate the influence of temporal refinements, the thesis provides two novel synchronization techniques. The proposed verification and synchronization techniques have been applied to relevant applications in the computation area and to communication protocols.
  •  
9.
  • Shaoteng, Liu, 1984- (author)
  • New circuit switching techniques in on-chip networks
  • 2015
  • Doctoral thesis (other academic/artistic)abstract
    • Network on Chip (NoC) is proposed as a promising technology to address the communication challenges in deep sub-micron era. NoC brings network-based communication into the on-chip environment and tackles the problems like long wire complexities, bandwidth scaling and so on. After more than a decade's evolution and development, there are many NoC architectures and solutions available. Nevertheless, NoCs can be classi_ed into two categories: packet switched NoC and circuit switched NoC. In this thesis, targeting circuit switched NoC, we present our innovations and considerations on circuit switched NoCs in three areas, namely, connection setup method, time division multiplexing (TDM) technology and spatial division multiplexing (SDM) technology.Connection setup technique deeply inuences the architecture and performance of a circuit switched NoC, since circuit switched NoC requires to set up connections before launching data transfer. We propose a novel parallel probe based method for dynamic distributed connection setup. This setup method on one hand searches all the possible minimal paths in parallel. On the other hand, it also has a mechanism to reduce resource occupation during the path search process by reclaiming redundant paths. With this setup method, connections are more likely to be established because of the exploration on the path diversity.TDM based NoC constitutes a sub-category of circuit switched NoC. We propose a double time-wheel technique to facilitate a probe based connection setup in TDM NoCs. With this technique, path search algorithms used in connection setup are no longer limited to deterministic routing algorithms. Moreover, the hardware cost can be reduced, since setup requests and data flows can co-exist in one network. Apart from the double time-wheel technique for connection setup, we also propose a highway technique that can enhance the slot utilization during data transfer. This technique can accelerate the transfer of a data flow while maintaining the throughput guarantee and the packet order.SDM based NoC constitutes another sub-category of circuit switched NoC. SDM NoC can benefit from high clock frequency and simple synchronization efforts. To better support the dynamic connection setup in SDM NoCs, we design a single cycle allocator for channel allocation inside each router. This allocator can guarantee both strong fairness and maximal matching quality. We also build up a circuit switched NoC, which can support multiple channels and multiple networks, to study different ways of organizing channels and setting up connections. Finally, we make a comparison between circuit switched NoC and packet switched NoC. We show the strengths and weaknesses on each of them by analysis and evaluation.
  •  
10.
  • Zhu, Jun, 1976- (author)
  • Performance Analysis and Implementationof Predictable Streaming Applications onMultiprocessor Systems-on-Chip
  • 2010
  • Doctoral thesis (other academic/artistic)abstract
    • Driven by the increasing capacity of integrated circuits, multiprocessorsystems-on-chip (MPSoCs) are widely used in modern consumer electron-ics devices. In this thesis, the performance analysis and implementationmethodologies are explored to design predictable streaming applications onMPSoCs computing platforms. The application functionality and concur-rency are described in synchronous data flow (SDF) computational models,and two state-of-the-art architecture templates are adopted as multiproces-sor architectures, i.e., network-on-chip (NoC) based MPSoC and hybrid re-configurable CPU/FPGA platforms. Based on the author’s contributions onsimulation and formal analytical methods, performance analysis and designspace exploration for embedded MPSoCs architectures have been addressed. An energy efficient design space exploration flow is proposed for stream-ing applications with guaranteed throughput on NoC based MPSoCs, in whichboth application throughput analysis and system energy calculation are car-ried out by simulation on a multi-clocked synchronous modelling frame-work. On the other hand, based on event models of data streams, a formalanalytical scheduling framework for real-time streaming applications withminimal buffer requirement on hybrid CPU/FPGA architectures is exploited.The scheduling problem has been formalized declaratively by constraint basetechniques, and solved by a public domain constraint solver. Consecutively,the constraint based method has been extended to solve problems rangingfrom global computation/communication scheduling and reconfiguration anal-ysis to Pareto efficient design. Finally, a prototype of stream processing sys-tem on FPGA based MPSoC is built to substantiate the results from theoreti-cal studies in this thesis.
  •  
Skapa referenser, mejla, bekava och länka
  • Result 1-10 of 10

Kungliga biblioteket hanterar dina personuppgifter i enlighet med EU:s dataskyddsförordning (2018), GDPR. Läs mer om hur det funkar här.
Så här hanterar KB dina uppgifter vid användning av denna tjänst.

 
pil uppåt Close

Copy and save the link in order to return to this view